A new VAD framework using statistical model and human knowledge based empirical rule

Authors

  • Ji Wu
  • Xiao-Lei Zhang
  • Wei Li
Abstract

This paper presents a new voice activity detection (VAD) framework based on empirical rules and statistical models. First, the framework efficiently detects candidate endpoints in the time domain using empirical rules derived from human knowledge and the continuity of speech. It then confirms the candidate endpoints in the transform domain, using different confirmation schemes for the beginning point and the ending point. In the transform domain, a new algorithm called sliding-window double-layer confirmation (SWDC) is proposed to confirm the endpoints accurately, and sensitive data, which are used for GMM training, are proposed for the detection scheme. Experiments show that the proposed VAD framework achieves better performance under various environmental conditions.
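The first stage described in the abstract, proposing candidate endpoints in the time domain from rules about speech continuity, can be illustrated with a minimal sketch. This is not the authors' rule set, which the abstract does not specify. It assumes a fixed short-time energy threshold plus two hypothetical duration rules (`min_speech` and `min_silence`, counted in frames), and it omits the transform-domain confirmation stage (SWDC and the GMM) entirely. All function names and parameter values are illustrative assumptions.

```python
import math


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    n_frames = len(samples) // frame_len
    return [
        sum(x * x for x in samples[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def candidate_endpoints(energies, threshold, min_speech=3, min_silence=5):
    """Return candidate (begin_frame, end_frame) pairs, end exclusive.

    Assumed continuity rules (illustrative, not taken from the paper):
    a beginning point is proposed only after `min_speech` consecutive
    frames above `threshold`; an ending point only after `min_silence`
    consecutive frames at or below it.
    """
    segments = []
    in_speech = False
    run = 0      # length of the current run of "evidence" frames
    begin = 0
    for i, energy in enumerate(energies):
        active = energy > threshold
        if not in_speech:
            run = run + 1 if active else 0
            if run >= min_speech:
                in_speech = True
                begin = i - run + 1   # first frame of the active run
                run = 0
        else:
            run = run + 1 if not active else 0
            if run >= min_silence:
                segments.append((begin, i - run + 1))  # first silent frame
                in_speech = False
                run = 0
    if in_speech:
        # Signal ended during speech; drop any trailing silent frames.
        segments.append((begin, len(energies) - run))
    return segments


# Synthetic example: 10 silent frames, 8 frames of a tone, 10 silent frames.
FRAME_LEN = 80
silence = [0.0] * (10 * FRAME_LEN)
tone = [0.5 * math.sin(2 * math.pi * 4 * n / FRAME_LEN) for n in range(8 * FRAME_LEN)]
energies = frame_energies(silence + tone + silence, FRAME_LEN)
print(candidate_endpoints(energies, threshold=0.01))  # [(10, 18)]
```

In a two-stage design of the kind the abstract outlines, candidates like these would only be proposals. A second, more expensive check would then accept or reject each beginning and ending point.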


Similar resources

An efficient voice activity detection algorithm by combining statistical model and energy detection

In this article, we present a new voice activity detection (VAD) algorithm that is based on statistical models and an empirical rule-based energy detection algorithm. Specifically, it needs two steps to separate speech segments from background noise. In the first step, the VAD detects possible speech endpoints efficiently using the empirical rule-based energy detection algorithm. However, the poss...


A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)

Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, many non-speech signals may be classified as ‘noise’ in human communication. The process of separating conversational speech from noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...


Optimal Policy Rules for Iran in a DSGE Framework (Islamic Musharakah Approach)

The aim of this paper is to determine an optimal policy rule for the Iranian economy from an Islamic perspective. This study draws on an Islamic instrument known as the Musharakah contract to design a dynamic stochastic general equilibrium model. In this model, the interest rate is no longer considered a monetary policy instrument, and the focus is on the impact of economic shocks on the Dynam...


Endpoint detection using weighted finite state transducer

In this paper, we discuss the possibility of applying a weighted finite state transducer (WFST) as a unified framework to solve the endpoint detection problem. In general, endpoint detection is composed of two cascaded decision processes. The first process is voice activity detection (VAD), which makes frame-level speech/non-speech classifications. The second process is utterance-level detection which m...


Voice activity detection based on statistical models and machine learning approaches

Voice activity detectors (VADs) based on statistical models have shown impressive performance, especially when fairly precise statistical models are employed. Moreover, the accuracy of a VAD utilizing statistical models can be significantly improved when machine-learning techniques are adopted to provide prior knowledge of speech characteristics. In the first part of this paper, we intro...




Publication date: 2010